Efficient Computation and Visualization of Multiple Density-Based Clustering Hierarchies

Authors

Abstract

HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts. While a small change in mpts typically leads to a small change in the clustering structure, choosing a “good” mpts value can be challenging: depending on the data distribution, a high or low mpts value may be more appropriate, and certain clusters may reveal themselves at different values. To explore results for a range of mpts values, one has to run HDBSCAN* for each value independently, which can be computationally impractical. In this paper, we propose an approach to efficiently compute all HDBSCAN* hierarchies for a range of mpts values by building upon results from computational geometry to replace HDBSCAN*'s complete graph with a smaller equivalent graph. An experimental evaluation shows that with our approach one can obtain over one hundred hierarchies for the computational cost of running HDBSCAN* about twice, which corresponds to a speedup of more than 60 times, compared to running HDBSCAN* independently that many times. We also propose a series of visualizations that allow users to analyze a collection of hierarchies for a range of mpts values, along with case studies that illustrate how these analyses are performed.
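
To make the cost of the independent runs concrete, the following is a minimal sketch of the naive baseline that the abstract argues against, not the authors' method. It assumes the usual HDBSCAN* definitions from the general literature, which the abstract itself does not spell out: the core distance of a point is the distance to its mpts-th nearest neighbor (counting the point itself), and the mutual reachability distance between two points is the maximum of their two core distances and their direct distance. All function names here are hypothetical. For every mpts value the sketch rebuilds the complete mutual reachability graph and recomputes its minimum spanning tree from scratch.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def core_distances(dist, mpts):
    """Distance from each point to its mpts-th nearest neighbor.

    Assumption: the point itself counts as its first neighbor,
    so mpts must satisfy 1 <= mpts <= number of points.
    """
    return [sorted(row)[mpts - 1] for row in dist]


def mutual_reachability(dist, core):
    """Complete graph weighted by max(core[i], core[j], dist[i][j])."""
    n = len(dist)
    return [
        [0.0 if i == j else max(core[i], core[j], dist[i][j]) for j in range(n)]
        for i in range(n)
    ]


def mst_weight(graph):
    """Total weight of a minimum spanning tree (Prim's algorithm, dense graph)."""
    n = len(graph)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge linking each vertex to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v] and graph[u][v] < best[v]:
                best[v] = graph[u][v]
    return total


def naive_mst_weights(points, mpts_values):
    """Recompute the mutual reachability MST independently for every mpts value."""
    dist = pairwise_distances(points)
    return {
        m: mst_weight(mutual_reachability(dist, core_distances(dist, m)))
        for m in mpts_values
    }


if __name__ == "__main__":
    toy_points = [(0, 0), (1, 0), (2, 0), (10, 0)]
    print(naive_mst_weights(toy_points, [1, 2, 3]))
```

Each mpts value in this sketch costs a full pass over all pairs of points, so the total work grows linearly with the number of values explored. The abstract's proposal is to avoid that repetition by substituting a smaller equivalent graph for the complete one; that construction is not reproduced here.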


Similar resources

Efficient Anytime Density-based Clustering

Many clustering algorithms suffer from scalability problems on massive datasets and do not support any user interaction during runtime. To tackle these problems, anytime clustering algorithms are proposed. They produce a fast approximate result which is continuously refined during the further run. Also, they can be stopped or suspended anytime and provide an answer. In this paper, we propose a ...

Animated visualization of multiple intersecting hierarchies

We describe a new information structure composed of multiple intersecting hierarchies, which we call a Polyarchy. Visualizing polyarchies enables use of novel views for discovery of relationships which are very difficult using existing hierarchy visualization tools. This paper will describe the visualization design and system architecture challenges as well as our current solutions. Visual Pivo...

Density-Based Method for Clustering and Visualization of Complex Data

This paper discusses clustering and visualization of data structure. The authors review algorithmic solutions currently found in the literature that deal with clustering large volumes of data, focusing on their disadvantages and problems. In addition, the authors introduce and analyze a density-based algorithm called OPTICS (Ordering Points To Identify the Clustering Structure) ...

Improvement of density-based clustering algorithm using modifying the density definitions and input parameter

Clustering, which means grouping similar samples, is one of the main tasks in data mining. There is a wide variety of clustering algorithms, and density-based clustering is one category among them. Various algorithms have been proposed for this category; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...

DenPEHC: Density peak based efficient hierarchical clustering

Existing hierarchical clustering algorithms involve a flat clustering component and an additional agglomerative or divisive procedure. This paper presents a density peak based hierarchical clustering method (DenPEHC), which directly generates clusters on each possible clustering layer, and introduces a grid granulation framework to enable DenPEHC to cluster large-scale and high-dimensional (LSH...


Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2021

ISSN: 1558-2191, 1041-4347, 2326-3865

DOI: https://doi.org/10.1109/tkde.2019.2962412